10. Convolutions and rescaling
import urllib.request
from IPython.display import HTML, display

# set path containing data folder or use default for Colab (/gdrive/My Drive)
local_folder = "../"

# fetch the check_colab.py helper module, then use it to set the data path
urllib.request.urlretrieve('https://raw.githubusercontent.com/guiwitz/DLImaging/master/utils/check_colab.py', 'check_colab.py')
from check_colab import set_datapath
colab, datapath = set_datapath(local_folder)
To learn the basics of neural networks and of higher-level DL packages, we have so far used simple neural nets consisting only of linear (fully connected) layers and activations. The information in images, however, has a specific structure that calls for other types of layers. In particular, convolution plays a central role in this area.
From global to local information
When we linearize an image to pass it through a linear layer, the underlying assumption is that all pixels connected to a given activation in a layer are equivalent to each other. This is obviously an oversimplification: in most cases, a single pixel only makes sense within its local context, and e.g. a pixel in the upper left corner of an image has little to do with one in the lower left. Of course, all pixels of an image of the sea are still related to each other in the sense that they belong to waves, are blue etc., while those of an image of a forest belong to leaves, are green etc. This global connection can also be taken into account by looking at coarse-grained versions of the image.
Convolution filters and rescaling operations can specifically recover the two types of information mentioned above: convolutions recover local information, while rescaling allows us to do so at different scales.
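As a minimal sketch of what a coarse-grained version of an image can look like, the snippet below rescales a small array by averaging non-overlapping blocks of pixels. It uses plain NumPy rather than a DL package, and the helper name `downscale_mean` as well as the toy image are hypothetical, chosen only for illustration.

```python
import numpy as np

def downscale_mean(image, factor):
    """Coarse-grain a 2D image by averaging non-overlapping factor x factor blocks.

    Hypothetical helper for illustration; assumes both image dimensions
    are divisible by `factor`.
    """
    h, w = image.shape
    # split each axis into (number of blocks, block size),
    # then average over the two block-size axes
    blocks = image.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# toy 4x4 image: left half dark (0), right half bright (1)
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

coarse = downscale_mean(image, 2)
print(coarse.shape)
print(coarse)
```

Each pixel of `coarse` summarizes a 2x2 neighbourhood of the original, so a local operation applied to `coarse` effectively sees a larger region of the original image.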
Convolution
A convolution is simply image filtering: a small image, the filter \(f\), travels across the actual image \(I\) and, at each pixel, multiplies the filter values with the underlying image values and sums the result, generating a filtered image \(F\). At pixel \((a,b)\) of the image, we compute \(F_{a,b} = \sum_{i,j}^{N,M} f_{ij} I_{a+i,b+j}\), where \(N, M\) are the filter dimensions. In other words, we generate a new image via a local operation at each pixel. Below you can find an illustration of this with a uniform filter, for which the operation captures the local mean of the image. The result here is a smoothing of sharp edges.
HTML(url='https://raw.githubusercontent.com/guiwitz/DLImaging/master/illustrations/convol_mean.html')
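The formula for \(F_{a,b}\) given above can also be sketched directly in code. The following is a deliberately naive NumPy version with explicit loops, not how DL packages implement convolution. The helper name `filter_image` and the toy image are hypothetical, and the sketch assumes no padding, so the output is smaller than the input.

```python
import numpy as np

def filter_image(image, filt):
    """Compute F[a, b] = sum_ij filt[i, j] * image[a + i, b + j].

    Hypothetical helper for illustration; no padding is applied, so the
    output shrinks by (filter size - 1) along each axis.
    """
    n, m = filt.shape
    h, w = image.shape
    out = np.zeros((h - n + 1, w - m + 1))
    for a in range(out.shape[0]):
        for b in range(out.shape[1]):
            # multiply the filter with the image patch under it and sum
            out[a, b] = np.sum(filt * image[a:a + n, b:b + m])
    return out

# toy image with a sharp vertical edge: columns 0-2 are 0, columns 3-5 are 1
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# uniform 3x3 filter, normalized so that it computes the local mean
mean_filter = np.ones((3, 3)) / 9

smoothed = filter_image(image, mean_filter)
print(smoothed[0])
```

Along each row, the sharp 0-to-1 step of the input becomes a gradual ramp in `smoothed`, which is the smoothing effect of the uniform filter described above.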